Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 70995 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 7.6 MiB |
| Average record size in memory | 112.0 B |
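The memory figures above are consistent with a simple back-of-the-envelope calculation. The sketch below assumes that every column stores one 64-bit (8-byte) value per row; this is an inference from the reported 112.0 B average record size across 14 variables, not something the report states. The constant and function names are illustrative.

```python
# Assumption: each column holds one 8-byte (64-bit) value per row.
BYTES_PER_VALUE = 8
N_ROWS = 70995      # "Number of observations" in the report
N_COLUMNS = 14      # "Number of variables" in the report


def record_size_bytes(n_columns, bytes_per_value=BYTES_PER_VALUE):
    """Bytes needed to store one row."""
    return n_columns * bytes_per_value


def column_size_kib(n_rows, bytes_per_value=BYTES_PER_VALUE):
    """Size of a single column in KiB (1 KiB = 1024 bytes)."""
    return n_rows * bytes_per_value / 1024


def total_size_mib(n_rows, n_columns, bytes_per_value=BYTES_PER_VALUE):
    """Size of the whole table in MiB (1 MiB = 1024**2 bytes)."""
    return n_rows * record_size_bytes(n_columns, bytes_per_value) / 1024 ** 2


print(record_size_bytes(N_COLUMNS))                  # bytes per record
print(round(column_size_kib(N_ROWS), 1))             # KiB per column
print(round(total_size_mib(N_ROWS, N_COLUMNS), 1))   # MiB in total
```

Under that assumption the three printed values reproduce the 112.0 B record size, the 554.6 KiB per-variable memory size, and the 7.6 MiB total reported here.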
Variable types
| NUM | 12 |
|---|---|
| CAT | 2 |
Warnings
| odometer is highly skewed (γ1 = 35.00114843) | Skewed |
|---|---|
| df_index has unique values | Unique |
| manufacturer has 747 (1.1%) zeros | Zeros |
| condition has 35365 (49.8%) zeros | Zeros |
| fuel has 3457 (4.9%) zeros | Zeros |
| type has 19443 (27.4%) zeros | Zeros |
| paint_color has 13268 (18.7%) zeros | Zeros |
Reproduction
| Analysis started | 2020-10-10 16:21:53.583697 |
|---|---|
| Analysis finished | 2020-10-10 16:22:31.400535 |
| Duration | 37.82 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
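The reported duration can be checked directly from the two timestamps. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2020-10-10 16:21:53.583697")
finished = datetime.fromisoformat("2020-10-10 16:22:31.400535")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```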
df_index
Real number (ℝ≥0)
| Distinct | 70995 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 42740.91792 |
|---|---|
| Minimum | 0 |
| Maximum | 85319 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 554.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4184.7 |
| Q1 | 21471.5 |
| median | 42640 |
| Q3 | 64067.5 |
| 95-th percentile | 81184.3 |
| Maximum | 85319 |
| Range | 85319 |
| Interquartile range (IQR) | 42596 |
Descriptive statistics
| Standard deviation | 24679.43159 |
|---|---|
| Coefficient of variation (CV) | 0.577419316 |
| Kurtosis | -1.19866326 |
| Mean | 42740.91792 |
| Median Absolute Deviation (MAD) | 21302 |
| Skewness | -0.001148208366 |
| Sum | 3034391468 |
| Variance | 609074343.7 |
| Monotonicity | Strictly increasing |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2047 | 1 | < 0.1% | |
| 53903 | 1 | < 0.1% | |
| 57993 | 1 | < 0.1% | |
| 64138 | 1 | < 0.1% | |
| 62091 | 1 | < 0.1% | |
| 51852 | 1 | < 0.1% | |
| 49805 | 1 | < 0.1% | |
| 55950 | 1 | < 0.1% | |
| 14994 | 1 | < 0.1% | |
| 29339 | 1 | < 0.1% | |
| Other values (70985) | 70985 | > 99.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 85319 | 1 | < 0.1% | |
| 85318 | 1 | < 0.1% | |
| 85317 | 1 | < 0.1% | |
| 85315 | 1 | < 0.1% | |
| 85314 | 1 | < 0.1% |
price
Real number (ℝ≥0)
| Distinct | 3795 |
|---|---|
| Distinct (%) | 5.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12367.47668 |
|---|---|
| Minimum | 1050 |
| Maximum | 39999 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 554.6 KiB |
Quantile statistics
| Minimum | 1050 |
|---|---|
| 5-th percentile | 2895 |
| Q1 | 5990 |
| median | 9950 |
| Q3 | 16578 |
| 95-th percentile | 30910.2 |
| Maximum | 39999 |
| Range | 38949 |
| Interquartile range (IQR) | 10588 |
Descriptive statistics
| Standard deviation | 8575.313181 |
|---|---|
| Coefficient of variation (CV) | 0.693376135 |
| Kurtosis | 0.8157469837 |
| Mean | 12367.47668 |
| Median Absolute Deviation (MAD) | 4950 |
| Skewness | 1.165281363 |
| Sum | 878029007 |
| Variance | 73535996.16 |
| Monotonicity | Not monotonic |
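Several of the figures in the quantile and descriptive tables follow from one another, which makes them easy to sanity-check. The sketch below recomputes the derived values for price from the reported minimum, maximum, quartiles, standard deviation, sum, and row count. The helper name `derived_stats` is hypothetical and not part of pandas-profiling.

```python
def derived_stats(minimum, q1, q3, maximum, std, total, n):
    """Recompute statistics that are functions of other reported values."""
    mean = total / n
    return {
        "range": maximum - minimum,   # Maximum - Minimum
        "iqr": q3 - q1,               # Q3 - Q1
        "mean": mean,                 # Sum / number of observations
        "variance": std ** 2,         # square of the standard deviation
        "cv": std / mean,             # coefficient of variation
    }


# Inputs copied from the price tables in this report.
price = derived_stats(
    minimum=1050, q1=5990, q3=16578, maximum=39999,
    std=8575.313181, total=878029007, n=70995,
)
for name, value in price.items():
    print(name, value)
```

The recomputed range, IQR, mean, variance, and CV agree with the reported values up to rounding.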
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 6995 | 912 | 1.3% | |
| 7995 | 892 | 1.3% | |
| 8995 | 884 | 1.2% | |
| 5995 | 829 | 1.2% | |
| 4500 | 779 | 1.1% | |
| 3500 | 768 | 1.1% | |
| 9995 | 754 | 1.1% | |
| 5500 | 745 | 1.0% | |
| 6500 | 738 | 1.0% | |
| 4995 | 692 | 1.0% | |
| Other values (3785) | 63002 | 88.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1050 | 4 | < 0.1% | |
| 1095 | 1 | < 0.1% | |
| 1099 | 1 | < 0.1% | |
| 1100 | 21 | < 0.1% | |
| 1111 | 2 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 39999 | 17 | < 0.1% | |
| 39998 | 3 | < 0.1% | |
| 39997 | 19 | < 0.1% | |
| 39995 | 58 | 0.1% | |
| 39994 | 1 | < 0.1% |
year
Real number (ℝ≥0)
| Distinct | 21 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2010.758687 |
|---|---|
| Minimum | 2001 |
| Maximum | 2021 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 554.6 KiB |
Quantile statistics
| Minimum | 2001 |
|---|---|
| 5-th percentile | 2003 |
| Q1 | 2007 |
| median | 2011 |
| Q3 | 2014 |
| 95-th percentile | 2018 |
| Maximum | 2021 |
| Range | 20 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 4.695733948 |
|---|---|
| Coefficient of variation (CV) | 0.002335304568 |
| Kurtosis | -0.9003727376 |
| Mean | 2010.758687 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | -0.1536677832 |
| Sum | 142753813 |
| Variance | 22.04991731 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=21)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2013 | 5512 | 7.8% | |
| 2011 | 5293 | 7.5% | |
| 2012 | 5135 | 7.2% | |
| 2014 | 5014 | 7.1% | |
| 2008 | 4985 | 7.0% | |
| 2015 | 4733 | 6.7% | |
| 2007 | 4436 | 6.2% | |
| 2017 | 4334 | 6.1% | |
| 2010 | 4219 | 5.9% | |
| 2016 | 3975 | 5.6% | |
| Other values (11) | 23359 | 32.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2001 | 1290 | 1.8% | |
| 2002 | 1726 | 2.4% | |
| 2003 | 2154 | 3.0% | |
| 2004 | 2832 | 4.0% | |
| 2005 | 3249 | 4.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2021 | 6 | < 0.1% | |
| 2020 | 407 | 0.6% | |
| 2019 | 1884 | 2.7% | |
| 2018 | 2383 | 3.4% | |
| 2017 | 4334 | 6.1% |
manufacturer
Real number (ℝ≥0)
| Distinct | 39 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18.9175294 |
|---|---|
| Minimum | 0 |
| Maximum | 41 |
| Zeros | 747 |
| Zeros (%) | 1.1% |
| Memory size | 554.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 10 |
| median | 16 |
| Q3 | 31 |
| 95-th percentile | 39 |
| Maximum | 41 |
| Range | 41 |
| Interquartile range (IQR) | 21 |
Descriptive statistics
| Standard deviation | 11.56409725 |
|---|---|
| Coefficient of variation (CV) | 0.6112900369 |
| Kurtosis | -0.9902455298 |
| Mean | 18.9175294 |
| Median Absolute Deviation (MAD) | 9 |
| Skewness | 0.5631120537 |
| Sum | 1343050 |
| Variance | 133.7283451 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=39)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 13 | 12778 | 18.0% | |
| 7 | 10105 | 14.2% | |
| 39 | 6085 | 8.6% | |
| 16 | 4845 | 6.8% | |
| 31 | 4337 | 6.1% | |
| 20 | 3095 | 4.4% | |
| 10 | 2787 | 3.9% | |
| 14 | 2563 | 3.6% | |
| 37 | 2140 | 3.0% | |
| 17 | 2128 | 3.0% | |
| Other values (29) | 20132 | 28.4% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 747 | 1.1% | |
| 1 | 6 | < 0.1% | |
| 2 | 3 | < 0.1% | |
| 3 | 784 | 1.1% | |
| 4 | 1908 | 2.7% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 41 | 558 | 0.8% | |
| 40 | 1814 | 2.6% | |
| 39 | 6085 | 8.6% | |
| 38 | 8 | < 0.1% | |
| 37 | 2140 | 3.0% |
model
Real number (ℝ≥0)
| Distinct | 8075 |
|---|---|
| Distinct (%) | 11.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5119.081414 |
|---|---|
| Minimum | 0 |
| Maximum | 10038 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 554.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 664 |
| Q1 | 2792 |
| median | 5133 |
| Q3 | 7623 |
| 95-th percentile | 9452 |
| Maximum | 10038 |
| Range | 10038 |
| Interquartile range (IQR) | 4831 |
Descriptive statistics
| Standard deviation | 2765.129139 |
|---|---|
| Coefficient of variation (CV) | 0.5401611961 |
| Kurtosis | -1.124779596 |
| Mean | 5119.081414 |
| Median Absolute Deviation (MAD) | 2361 |
| Skewness | -0.02279244518 |
| Sum | 363429185 |
| Variance | 7645939.158 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 3936 | 1044 | 1.5% | |
| 8065 | 953 | 1.3% | |
| 3634 | 620 | 0.9% | |
| 1273 | 571 | 0.8% | |
| 2010 | 561 | 0.8% | |
| 56 | 556 | 0.8% | |
| 1163 | 534 | 0.8% | |
| 2295 | 480 | 0.7% | |
| 3785 | 472 | 0.7% | |
| 4651 | 437 | 0.6% | |
| Other values (8065) | 64767 | 91.2% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 5 | 1 | < 0.1% | |
| 6 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10038 | 1 | < 0.1% | |
| 10037 | 1 | < 0.1% | |
| 10034 | 1 | < 0.1% | |
| 10033 | 6 | < 0.1% | |
| 10032 | 1 | < 0.1% |
condition
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.101246567 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 35365 |
| Zeros (%) | 49.8% |
| Memory size | 554.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 3 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.159579255 |
|---|---|
| Coefficient of variation (CV) | 1.052969689 |
| Kurtosis | -1.363802462 |
| Mean | 1.101246567 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.3314757873 |
| Sum | 78183 |
| Variance | 1.344624048 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=6)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 35365 | 49.8% | |
| 2 | 25768 | 36.3% | |
| 3 | 7848 | 11.1% | |
| 1 | 1684 | 2.4% | |
| 4 | 231 | 0.3% | |
| 5 | 99 | 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 35365 | 49.8% | |
| 1 | 1684 | 2.4% | |
| 2 | 25768 | 36.3% | |
| 3 | 7848 | 11.1% | |
| 4 | 231 | 0.3% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 99 | 0.1% | |
| 4 | 231 | 0.3% | |
| 3 | 7848 | 11.1% | |
| 2 | 25768 | 36.3% | |
| 1 | 1684 | 2.4% |
cylinders
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.436227903 |
|---|---|
| Minimum | 0 |
| Maximum | 7 |
| Zeros | 305 |
| Zeros (%) | 0.4% |
| Memory size | 554.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 3 |
| median | 5 |
| Q3 | 5 |
| 95-th percentile | 6 |
| Maximum | 7 |
| Range | 7 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.256988924 |
|---|---|
| Coefficient of variation (CV) | 0.2833463364 |
| Kurtosis | -1.04221849 |
| Mean | 4.436227903 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.2797058224 |
| Sum | 314950 |
| Variance | 1.580021154 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=8)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 26987 | 38.0% | |
| 5 | 26299 | 37.0% | |
| 6 | 16377 | 23.1% | |
| 4 | 730 | 1.0% | |
| 0 | 305 | 0.4% | |
| 7 | 147 | 0.2% | |
| 2 | 133 | 0.2% | |
| 1 | 17 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 305 | 0.4% | |
| 1 | 17 | < 0.1% | |
| 2 | 133 | 0.2% | |
| 3 | 26987 | 38.0% | |
| 4 | 730 | 1.0% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 7 | 147 | 0.2% | |
| 6 | 16377 | 23.1% | |
| 5 | 26299 | 37.0% | |
| 4 | 730 | 1.0% | |
| 3 | 26987 | 38.0% |
fuel
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.934023523 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 3457 |
| Zeros (%) | 4.9% |
| Memory size | 554.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 2 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.4923988553 |
|---|---|
| Coefficient of variation (CV) | 0.2545981729 |
| Kurtosis | 11.70410423 |
| Mean | 1.934023523 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -2.14675949 |
| Sum | 137306 |
| Variance | 0.2424566327 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=5)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 65746 | 92.6% | |
| 0 | 3457 | 4.9% | |
| 3 | 1060 | 1.5% | |
| 4 | 634 | 0.9% | |
| 1 | 98 | 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 3457 | 4.9% | |
| 1 | 98 | 0.1% | |
| 2 | 65746 | 92.6% | |
| 3 | 1060 | 1.5% | |
| 4 | 634 | 0.9% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 4 | 634 | 0.9% | |
| 3 | 1060 | 1.5% | |
| 2 | 65746 | 92.6% | |
| 1 | 98 | 0.1% | |
| 0 | 3457 | 4.9% |
odometer
Real number (ℝ≥0)
| Distinct | 28733 |
|---|---|
| Distinct (%) | 40.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 114037.944 |
|---|---|
| Minimum | 0 |
| Maximum | 9400000 |
| Zeros | 208 |
| Zeros (%) | 0.3% |
| Memory size | 554.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 17581 |
| Q1 | 68690 |
| median | 109567 |
| Q3 | 150000 |
| 95-th percentile | 212907.5 |
| Maximum | 9400000 |
| Range | 9400000 |
| Interquartile range (IQR) | 81310 |
Descriptive statistics
| Standard deviation | 103329.2903 |
|---|---|
| Coefficient of variation (CV) | 0.9060956965 |
| Kurtosis | 2522.564326 |
| Mean | 114037.944 |
| Median Absolute Deviation (MAD) | 40695 |
| Skewness | 35.00114843 |
| Sum | 8096123835 |
| Variance | 1.067694223e+10 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 140000 | 275 | 0.4% | |
| 150000 | 269 | 0.4% | |
| 130000 | 268 | 0.4% | |
| 120000 | 247 | 0.3% | |
| 170000 | 242 | 0.3% | |
| 160000 | 232 | 0.3% | |
| 180000 | 222 | 0.3% | |
| 0 | 208 | 0.3% | |
| 145000 | 200 | 0.3% | |
| 135000 | 199 | 0.3% | |
| Other values (28723) | 68633 | 96.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 208 | 0.3% | |
| 1 | 59 | 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 2 | < 0.1% | |
| 4 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 9400000 | 1 | < 0.1% | |
| 8888888 | 1 | < 0.1% | |
| 7465644 | 1 | < 0.1% | |
| 7441104 | 1 | < 0.1% | |
| 4900000 | 1 | < 0.1% |
transmission
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 554.6 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 64854 | 91.4% | |
| 1 | 4292 | 6.0% | |
| 2 | 1849 | 2.6% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
drive
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 554.6 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 29964 | 42.2% | |
| 0 | 28331 | 39.9% | |
| 2 | 12700 | 17.9% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
type
Real number (ℝ≥0)
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.907303331 |
|---|---|
| Minimum | 0 |
| Maximum | 12 |
| Zeros | 19443 |
| Zeros (%) | 27.4% |
| Memory size | 554.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 8 |
| Q3 | 9 |
| 95-th percentile | 11 |
| Maximum | 12 |
| Range | 12 |
| Interquartile range (IQR) | 9 |
Descriptive statistics
| Standard deviation | 4.235857361 |
|---|---|
| Coefficient of variation (CV) | 0.7170543179 |
| Kurtosis | -1.515071859 |
| Mean | 5.907303331 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.3816937941 |
| Sum | 419389 |
| Variance | 17.94248758 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=13)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 9 | 21106 | 29.7% | |
| 0 | 19443 | 27.4% | |
| 10 | 8272 | 11.7% | |
| 8 | 5186 | 7.3% | |
| 3 | 3656 | 5.1% | |
| 4 | 3277 | 4.6% | |
| 11 | 2787 | 3.9% | |
| 12 | 2431 | 3.4% | |
| 5 | 2339 | 3.3% | |
| 2 | 1466 | 2.1% | |
| Other values (3) | 1032 | 1.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 19443 | 27.4% | |
| 1 | 64 | 0.1% | |
| 2 | 1466 | 2.1% | |
| 3 | 3656 | 5.1% | |
| 4 | 3277 | 4.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 12 | 2431 | 3.4% | |
| 11 | 2787 | 3.9% | |
| 10 | 8272 | 11.7% | |
| 9 | 21106 | 29.7% | |
| 8 | 5186 | 7.3% |
paint_color
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.653623495 |
|---|---|
| Minimum | 0 |
| Maximum | 11 |
| Zeros | 13268 |
| Zeros (%) | 18.7% |
| Memory size | 554.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 8 |
| Q3 | 9 |
| 95-th percentile | 10 |
| Maximum | 11 |
| Range | 11 |
| Interquartile range (IQR) | 8 |
Descriptive statistics
| Standard deviation | 3.991704175 |
|---|---|
| Coefficient of variation (CV) | 0.7060435097 |
| Kurtosis | -1.580678747 |
| Mean | 5.653623495 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.2968371636 |
| Sum | 401379 |
| Variance | 15.93370222 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=12)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 16772 | 23.6% | |
| 0 | 13268 | 18.7% | |
| 9 | 11462 | 16.1% | |
| 5 | 8747 | 12.3% | |
| 1 | 7717 | 10.9% | |
| 8 | 6925 | 9.8% | |
| 2 | 1828 | 2.6% | |
| 4 | 1700 | 2.4% | |
| 3 | 1555 | 2.2% | |
| 11 | 437 | 0.6% | |
| Other values (2) | 584 | 0.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 13268 | 18.7% | |
| 1 | 7717 | 10.9% | |
| 2 | 1828 | 2.6% | |
| 3 | 1555 | 2.2% | |
| 4 | 1700 | 2.4% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 11 | 437 | 0.6% | |
| 10 | 16772 | 23.6% | |
| 9 | 11462 | 16.1% | |
| 8 | 6925 | 9.8% | |
| 7 | 217 | 0.3% |
state
Real number (ℝ≥0)
| Distinct | 51 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 24.7617297 |
|---|---|
| Minimum | 0 |
| Maximum | 50 |
| Zeros | 656 |
| Zeros (%) | 0.9% |
| Memory size | 554.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 12 |
| median | 24 |
| Q3 | 37 |
| 95-th percentile | 48 |
| Maximum | 50 |
| Range | 50 |
| Interquartile range (IQR) | 25 |
Descriptive statistics
| Standard deviation | 14.62857776 |
|---|---|
| Coefficient of variation (CV) | 0.590773663 |
| Kurtosis | -1.319492316 |
| Mean | 24.7617297 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | -0.03124648915 |
| Sum | 1757959 |
| Variance | 213.9952872 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 4 | 5188 | 7.3% | |
| 9 | 4975 | 7.0% | |
| 34 | 4316 | 6.1% | |
| 35 | 3444 | 4.9% | |
| 48 | 3233 | 4.6% | |
| 22 | 3209 | 4.5% | |
| 43 | 2884 | 4.1% | |
| 38 | 2585 | 3.6% | |
| 27 | 2514 | 3.5% | |
| 45 | 2443 | 3.4% | |
| Other values (41) | 36204 | 51.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 656 | 0.9% | |
| 1 | 1083 | 1.5% | |
| 2 | 523 | 0.7% | |
| 3 | 894 | 1.3% | |
| 4 | 5188 | 7.3% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 50 | 206 | 0.3% | |
| 49 | 112 | 0.2% | |
| 48 | 3233 | 4.6% | |
| 47 | 711 | 1.0% | |
| 46 | 974 | 1.4% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
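The calculation described above can be sketched in a few lines of pure Python. The function name `pearson_r` is illustrative and is not taken from the report's own implementation; the sketch simply divides the covariance by the product of the standard deviations (the 1/n factors cancel, so plain sums are used).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their spreads."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered products and squares; the common 1/n factor cancels.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (spread_x * spread_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson_r(x, y)
# Shifting and rescaling y by a positive factor leaves r unchanged.
r_shifted = pearson_r(x, [3 * v + 7 for v in y])
print(r, r_shifted)
```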
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
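The pair-counting definition above corresponds to the variant usually called tau-a, sketched below in pure Python. Note that this is an assumption about which variant the text describes: library implementations commonly use the tie-corrected tau-b instead, so results can differ when the data contain ties. The function name `kendall_tau_a` is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# One swapped neighbor: 5 concordant pairs and 1 discordant pair out of 6.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

The double loop over all pairs is quadratic in the number of observations, which is fine for illustration but slow for a dataset of this size.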
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal, and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | df_index | price | year | manufacturer | model | condition | cylinders | fuel | odometer | transmission | drive | type | paint_color | state |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 16995 | 2007 | 14 | 8026 | 2 | 6 | 0 | 254217 | 0 | 0 | 10 | 10 | 23 |
| 1 | 1 | 13995 | 2012 | 13 | 3936 | 2 | 5 | 2 | 188406 | 0 | 0 | 10 | 5 | 23 |
| 2 | 2 | 7995 | 2010 | 7 | 3552 | 2 | 3 | 2 | 108124 | 0 | 0 | 0 | 5 | 23 |
| 3 | 3 | 8995 | 2011 | 7 | 9197 | 2 | 5 | 2 | 178054 | 0 | 0 | 0 | 10 | 23 |
| 4 | 4 | 10995 | 2014 | 13 | 3785 | 2 | 5 | 2 | 170259 | 0 | 0 | 0 | 10 | 23 |
| 5 | 5 | 12995 | 2004 | 34 | 272 | 2 | 5 | 0 | 309621 | 0 | 0 | 10 | 3 | 23 |
| 6 | 6 | 10995 | 2011 | 7 | 8065 | 2 | 6 | 2 | 210865 | 0 | 0 | 10 | 9 | 23 |
| 7 | 7 | 12450 | 2011 | 7 | 8065 | 2 | 6 | 2 | 150959 | 0 | 0 | 10 | 1 | 23 |
| 8 | 8 | 15995 | 2011 | 7 | 8136 | 2 | 6 | 2 | 223470 | 0 | 0 | 10 | 10 | 23 |
| 9 | 9 | 11995 | 2009 | 7 | 8065 | 2 | 6 | 2 | 170684 | 0 | 0 | 10 | 1 | 23 |
Last rows
| | df_index | price | year | manufacturer | model | condition | cylinders | fuel | odometer | transmission | drive | type | paint_color | state |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 70985 | 85308 | 14583 | 2017 | 31 | 5669 | 0 | 3 | 2 | 17935 | 1 | 1 | 0 | 10 | 23 |
| 70986 | 85309 | 8000 | 2008 | 4 | 590 | 2 | 5 | 2 | 100685 | 0 | 0 | 9 | 10 | 23 |
| 70987 | 85311 | 12000 | 2013 | 39 | 7011 | 2 | 3 | 3 | 69880 | 0 | 1 | 4 | 1 | 23 |
| 70988 | 85312 | 5700 | 2014 | 17 | 3429 | 4 | 3 | 2 | 119000 | 0 | 1 | 9 | 10 | 32 |
| 70989 | 85313 | 29500 | 2015 | 39 | 8740 | 3 | 5 | 2 | 75000 | 0 | 0 | 8 | 1 | 34 |
| 70990 | 85314 | 1600 | 2004 | 41 | 9796 | 0 | 4 | 2 | 292255 | 0 | 0 | 12 | 1 | 34 |
| 70991 | 85315 | 9885 | 2012 | 37 | 4733 | 0 | 3 | 2 | 82000 | 0 | 0 | 4 | 9 | 6 |
| 70992 | 85317 | 4800 | 2002 | 13 | 6439 | 2 | 5 | 2 | 58000 | 0 | 2 | 3 | 1 | 32 |
| 70993 | 85318 | 1600 | 2006 | 17 | 8316 | 1 | 5 | 2 | 159980 | 0 | 1 | 9 | 1 | 23 |
| 70994 | 85319 | 9000 | 2003 | 39 | 7841 | 0 | 6 | 2 | 160000 | 0 | 0 | 0 | 4 | 23 |